Future directions in learning to rank

Authors

  • Olivier Chapelle
  • Yi Chang
  • Tie-Yan Liu
Abstract

The results of the learning to rank challenge showed that the predictions from the top competitors are very close to each other in quality. This raises a question: is learning to rank a solved problem? On the one hand, it is likely that only small incremental progress can be made on the “core”, traditional problems of learning to rank. The challenge was set in this standard learning to rank scenario: optimize a ranking measure on a test set. On the other hand, there are many related questions and settings in learning to rank that have not yet been fully explored. We review some of them in this paper and hope that researchers interested in learning to rank will try to answer these challenging and exciting research questions.

1. Learning Theory for Ranking

Many learning to rank algorithms have been shown to be effective through benchmark experiments. However, benchmark experiments are sometimes not as reliable as expected because of the small scale of the training and test data. In this situation, a theory is needed to guarantee the performance of an algorithm on infinite unseen data. Statistical learning theory, specifically generalization theory, investigates bounds relating the risk on the finite training data to the risk on infinite test data. In learning to rank, the risk on the training data is defined with a surrogate loss function (e.g., the pairwise losses in Ranking SVM (Joachims, 2002) and RankBoost (Freund et al., 2003)), while the risk on the test set is measured by a ranking measure (e.g., 1-NDCG or 1-MAP).
Therefore, to obtain a generalization bound in this setting, we need to address the following issues: (i) a reasonable assumption on the data generation (e.g., queries and documents); (ii) a generalization bound for the surrogate loss function; (iii) the relationship between the surrogate loss function and the ranking measure; and (iv) the existence of the limit of the ranking measure when the number of documents approaches infinity. There have been a number of attempts to address these issues, but a large open space remains to be explored.

• Assumption on data generation. In (Agarwal and Niyogi, 2005; Clemencon and Vayatis, 2007) it is assumed that the documents in the training data are sampled in an i.i.d. manner, no matter which queries they are associated with. However, it is c

© 2011 O. Chapelle, Y. Chang & T.-Y. Liu.
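
The gap between the surrogate loss used in training and the ranking measure used in evaluation, issue (iii) above, can be illustrated with a small sketch. It is not taken from the paper: the function names are hypothetical, the pairwise loss is a hinge loss with margin 1 of the kind associated with Ranking SVM-style methods, and NDCG uses the common gain 2^rel - 1 with a log2 position discount, which is only one of several conventions.

```python
import math


def pairwise_hinge_loss(scores, labels):
    """Mean hinge loss over ordered pairs (i, j) where doc i is more relevant than doc j.

    Assumption: margin of 1, as in a typical pairwise hinge formulation.
    """
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                total += max(0.0, 1.0 - (scores[i] - scores[j]))
                pairs += 1
    return total / pairs if pairs else 0.0


def dcg(labels_in_ranked_order):
    """Discounted cumulative gain with gain 2^rel - 1 and log2 discount (assumed convention)."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(labels_in_ranked_order)
    )


def one_minus_ndcg(scores, labels):
    """Ranking measure 1 - NDCG for one query; 0 means the ranking is ideal."""
    ranked = [
        rel
        for _, rel in sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    ]
    ideal = dcg(sorted(labels, reverse=True))
    return 1.0 - dcg(ranked) / ideal if ideal > 0 else 0.0


# Toy query with three documents (made-up relevance labels and scores).
labels = [2, 1, 0]
well_separated = [3.0, 2.0, 1.0]   # correct order, margins satisfied
barely_separated = [0.5, 0.4, 0.3]  # correct order, margins violated
reversed_order = [1.0, 2.0, 3.0]    # wrong order

for name, scores in [
    ("well separated", well_separated),
    ("barely separated", barely_separated),
    ("reversed", reversed_order),
]:
    print(name, pairwise_hinge_loss(scores, labels), one_minus_ndcg(scores, labels))
```

In this toy case the barely separated scores give an ideal ranking (1-NDCG of zero) while the surrogate loss stays positive, which is why a bound on the surrogate risk alone does not immediately say anything about the ranking measure.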

Similar resources

ESEE 2017 Transformative learning: new directions in agricultural extension and education

ESEE 2015, Chania, Greece. Conference dates: July 4 – 7, 2017. Conference theme: Transformative learning: new directions in agricultural extension and education. The 23rd European Seminar on Extension (and) Education (ESEE) will be held in the Mediterranean Agronomic Institute of Chania, and hosted by the Lab. of Agricultural Extension, Rural Systems & Rural Sociology, Dept. of Agricultural Economics &...

Learning to Rank

In this tutorial I will introduce ‘learning to rank’, a machine learning technology for constructing a model for ranking objects using training data. I will first explain the problem formulation of learning to rank, and the relations between learning to rank and other learning tasks. I will then describe learning to rank methods developed in recent years, including pointwise, pairwise, and listw...

LETOR: A Benchmark Collection for Learning to Rank for Information Retrieval

Learning to rank has attracted great attention recently in both the information retrieval and machine learning communities. However, the lack of a public dataset had stood in its way until the LETOR benchmark dataset (actually a group of three datasets) was released in the SIGIR 2007 workshop on Learning to Rank for Information Retrieval (LR4IR 2007). Since then, this dataset has been widely used in ...

Association Rule Mining with enhancing List Level Storage for Web Logs: A Survey

Storing and calculating web page rank is a crucial research area. Several research efforts are ongoing, but improvement is still needed for the following reasons: 1) impact should be calculated cumulatively rather than based on a single page rank; 2) automatic rank identification; 3) the way of storage. Our study therefore focuses mainly on these three directions. Our study analy...

Effective Learning to Rank Persian Web Content

The Persian language is one of the most widely used languages on the Web. Hence, the Persian Web includes invaluable information that needs to be retrieved effectively. As with other languages, ranking algorithms for Persian Web content deal with different challenges, such as applicability issues in real-world situations as well as the lack of user modeling. CF-Rank, as a ...

Market Orientation, Social Entrepreneurial Orientation, and Organizational Performance: The Mediating Role of Learning Orientation

One of the emerging research areas in the strategic orientation is how to transfer different orientations from the commercial sector to the non-profit sector. Therefore, the objective of this study is to determine the mediating effect of Learning Orientation on the Market Orientation, Social Entrepreneurial Orientation, and Organizational Performance in the non-profit sector. The data from more...

Publication date: 2011